BinChill: A Metagenomic Binning Ensemble Method
Abstract
The goal of metagenomic binning is to reconstruct genomes from a mixture of DNA sequences into genomic bins, which can be considered a clustering task. Multiple methods have been proposed for this task, such as distance-based metrics, machine learning, and ensemble approaches. We propose BinChill, a metagenomic binning ensemble method, based on the generic co-occurrence ensembler ACE. BinChill incorporates domain information in the form of Single-Copy Genes (SCG) with the co-occurrence ensemble strategy. This strategy combines multiple partitions according to how often two items co-occur in the same cluster. BinChill was able to reconstruct more or equally many high- and medium-quality bins while having an equal or faster runtime than other metagenomics-specific ensemble methods on the smaller simulated dataset. On the larger datasets, both simulated and real-world, BinChill outperformed the other methods by reconstructing more high-quality bins, at the cost of increased processing time when compared to the other algorithms. This is due to the domain-specific steps that our method implements. Our results show that the strengths of multiple binning methods can be combined to generate a partition of higher quality.
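The co-occurrence idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the ACE or BinChill procedure, and it ignores the SCG-based steps entirely; it only shows how several partitions of the same contigs might be combined by counting how often two items share a cluster. The function names `co_occurrence` and `consensus`, the contig labels, and the 0.5 threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def co_occurrence(partitions):
    """Return, for each pair of items, the fraction of partitions
    in which the two items were placed in the same cluster."""
    counts = defaultdict(int)
    for partition in partitions:
        for cluster in partition:
            # Sort so that each pair is always keyed in the same order.
            for a, b in combinations(sorted(cluster), 2):
                counts[(a, b)] += 1
    n = len(partitions)
    return {pair: c / n for pair, c in counts.items()}


def consensus(partitions, threshold=0.5):
    """Hypothetical consensus step: link items whose co-occurrence
    exceeds `threshold` and return the resulting connected groups."""
    items = sorted({x for p in partitions for c in p for x in c})
    parent = {x: x for x in items}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in co_occurrence(partitions).items():
        if score > threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for x in items:
        groups[find(x)].append(x)
    return sorted(sorted(g) for g in groups.values())


# Three made-up binnings of five contigs (illustrative data only).
partitions = [
    [{"c1", "c2", "c3"}, {"c4", "c5"}],
    [{"c1", "c2"}, {"c3", "c4", "c5"}],
    [{"c1", "c2", "c3"}, {"c4"}, {"c5"}],
]
print(consensus(partitions))
```

Linking by a fixed threshold is the simplest possible way to turn pairwise co-occurrence into a partition; a real ensembler would typically use a more careful merging rule and, in BinChill's case, domain information from single-copy genes.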
Similar resources
Metagenomic reads binning with spaced seeds
Article history: Received 23 February 2017 Received in revised form 16 May 2017 Accepted 21 May 2017 Available online xxxx
Metagenomic binning through low density hashing
Bacterial microbiomes of incredible complexity are found throughout the world, from exotic marine locations to the soil in our yards to within our very guts. With recent advances in Next-Generation Sequencing (NGS) technologies, we have vastly greater quantities of microbial genome data, but the nature of environmental samples is such that DNA from different species are mixed together. Here, we...
Low-Density Locality-Sensitive Hashing Boosts Metagenomic Binning.
Metagenomic binning is an essential task in analyzing metagenomic sequence datasets. To analyze structure or function of microbial communities from environmental samples, metagenomic sequence fragments are assigned to their taxonomic origins. Although sequence alignment algorithms, such as BWA, Bowtie or BLAST, can readily be used and usually provide high-resolution alignments and accurate binn...
A new ensemble clustering method based on fuzzy cmeans clustering while maintaining diversity in ensemble
Ensemble clustering has been considered one of the research approaches in data mining, pattern recognition, machine learning and artificial intelligence over the last decade. In ensemble clustering, several base clusterings are produced first, and then an aggregation function is used to create a final clustering that is as similar as possible to all the base clusterings. The inp...
A Novel Abundance-Based Algorithm for Binning Metagenomic Sequences Using l-Tuples
Metagenomics is the study of microbial communities sampled directly from their natural environment, without prior culturing. Among the computational tools recently developed for metagenomic sequence analysis, binning tools attempt to classify the sequences in a metagenomic dataset into different bins (i.e., species), based on various DNA composition patterns (e.g., the tetramer frequencies) of ...
Journal
Journal title: IEEE Access
Year: 2023
ISSN: 2169-3536
DOI: https://doi.org/10.1109/access.2023.3277755